MAP Estimation of Whole-Word Acoustic Models with Dictionary Priors

Authors

  • Keith Kintzley
  • Aren Jansen
  • Hynek Hermansky
Abstract

The intrinsic advantages of whole-word acoustic modeling are offset by the problem of data sparsity. To address this, we present several parametric approaches to estimating intra-word phonetic timing models under the assumption that relative timing is independent of word duration. We show evidence that the timing of phonetic events is well described by the Gaussian distribution. We explore the construction of models in the absence of keyword examples (dictionary-based), when keyword examples are abundant (Gaussian mixture models), and also present a Bayesian approach which unifies the two. Applying these techniques in a point process model keyword spotting framework, we demonstrate a 55% relative improvement in performance for models constructed from few examples.
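The Bayesian approach mentioned in the abstract can be illustrated with a standard conjugate MAP update for a Gaussian mean. This is only a minimal sketch, not the authors' implementation. Several things are assumptions made for illustration: event times are normalized to [0, 1] by word duration, the variance is treated as known, the dictionary prior places each phone's center at equal spacing across the word, and the names `dictionary_prior_means`, `map_gaussian_mean`, and `prior_strength` are hypothetical.

```python
from typing import List, Sequence


def dictionary_prior_means(num_phones: int) -> List[float]:
    """Hypothetical dictionary prior: phone centers equally spaced in
    normalized word time [0, 1]. This spacing is an assumption for
    illustration, not taken from the paper."""
    if num_phones <= 0:
        raise ValueError("num_phones must be positive")
    return [(i + 0.5) / num_phones for i in range(num_phones)]


def map_gaussian_mean(samples: Sequence[float],
                      prior_mean: float,
                      prior_strength: float) -> float:
    """MAP estimate of a Gaussian mean with known variance under a
    conjugate normal prior. prior_strength behaves like a count of
    pseudo-observations located at prior_mean."""
    if prior_strength < 0:
        raise ValueError("prior_strength must be non-negative")
    n = len(samples)
    if n == 0:
        # No keyword examples: fall back to the dictionary prior.
        return prior_mean
    sample_mean = sum(samples) / n
    # Weighted blend of the prior mean and the sample mean.
    return (prior_strength * prior_mean + n * sample_mean) / (prior_strength + n)


if __name__ == "__main__":
    # Two-phone word: the prior puts the first phone at 0.25 of the word.
    priors = dictionary_prior_means(2)
    # Two observed examples at 0.5, prior worth two pseudo-observations.
    print(map_gaussian_mean([0.5, 0.5], priors[0], prior_strength=2.0))  # 0.375
```

With no examples the estimate equals the dictionary prior, and as the number of examples grows it approaches the sample mean, which is the sense in which such an estimator bridges the dictionary-based and example-based regimes.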


Similar resources

Text-to-speech inspired duration modeling for improved whole-word acoustic models

In the construction of whole-word acoustic models, we have previously demonstrated substantial gains by using MAP estimation to introduce a simple prior model of phonetic timing. Based solely on the word’s phonetic (dictionary) pronunciation, this simple model included no information about the individual durations of constituent phones. However, the problem of modeling segmental duration has lo...

Developing 3 dimensional model for estimation of acoustic power in urban pathways in geo-spatial information system framework

Around the world, traffic growth is causing growing air and noise pollution. Noise levels in a given area are affected by traffic on the streets as well as other factors, including existing infrastructure and industrial centers. The purpose of this research is to model and estimate the amount of acoustic emission in the streets of Tehran's third district, using the 3D spatial info...

Audio Scene Understanding using Topic Models

This paper introduces a method to apply the topic models in an audio scene understanding framework. Assuming that an audio signal consists of latent topics that generate acoustic words describing an audio scene, we propose to use a vector quantization method to build an acoustic word dictionary. The classification experiments with semantic labels yield promising results of using the topic model...

Fabricating conversational speech data with acoustic models: a program to examine model-data mismatch

We present a study of data simulated using acoustic models trained on Switchboard data, and then recognized using various Switchboard-trained acoustic models. When we recognize real Switchboard conversations, simple development models give a word error rate (WER) of about 47 percent. If instead we simulate the speech data using word transcriptions of the conversation, obtaining the pronunciatio...

Real-time spontaneous Ukrainian speech recognition system based on word acoustic composite models

This paper describes implementation of methods and algorithms for the automatic speech recognition based on word composition proceeding from acoustic phoneme models. Such a design of the speech-to-text decoder is conventional and most productive for Western languages. The aim is to explore this approach applied to the Ukrainian language that is highly inflective with relatively free word order....



Publication date: 2012